Abstract

In the static analysis of functional programs, pushdown flow anal- 
ysis and abstract garbage collection skirt just inside the boundaries 
of soundness and decidability. Alone, each method reduces analy- 
sis times and boosts precision by orders of magnitude. This work 
illuminates and conquers the theoretical challenges that stand in the 
way of combining the power of these techniques. The challenge in 
marrying these techniques is not subtle: computing the reachable 
control states of a pushdown system relies on limiting access dur- 
ing transition to the top of the stack; abstract garbage collection, 
on the other hand, needs full access to the entire stack to compute 
a root set, just as concrete collection does. Introspective pushdown 
systems resolve this conflict. Introspective pushdown systems pro- 
vide enough access to the stack to allow abstract garbage collection, 
but they remain restricted enough to compute control-state reacha- 
bility, thereby enabling the sound and precise product of pushdown 
analysis and abstract garbage collection. Experiments reveal syn- 
ergistic interplay between the techniques, and the fusion demon- 
strates "better-than-both-worlds" precision. 

Categories and Subject Descriptors D.3.4 [Programming Languages]:
Processors - Optimization; F.3.2 [Logics and Meanings of Programs]:
Semantics of Programming Languages - Program analysis, Operational
semantics

General Terms Languages, Theory 

Keywords CFA2, pushdown systems, abstract interpretation, 
pushdown analysis, program analysis, abstract machines, abstract 
garbage collection, higher-order languages 



1. Introduction 

The recent development of a context-free¹ approach to control-flow
analysis (CFA2) by Vardoulakis and Shivers has provoked a seismic
shift in the static analysis of higher-order programs [22]. Prior to
CFA2, a precise analysis of recursive behavior had been a stumbling
block, even though flow analyses have an important role to play in
optimization for functional languages, such as flow-driven inlining
[13], interprocedural constant propagation [19] and type-check
elimination [23].

¹ As in context-free language, not context-sensitivity.

While it had been possible to statically analyze recursion soundly, 
CFA2 made it possible to analyze recursion precisely by matching 
calls and returns without approximation. In its pursuit of recursion, 
clever engineering steered CFA2 just shy of undecidability. The 
payoff is an order-of-magnitude reduction in analysis time and an 
order-of-magnitude increase in precision. 

For a visual measure of the impact, Figure 1 renders the abstract
transition graph (a model of all possible traces through the program)
for the toy program in Figure 2. For this example, pushdown analysis
eliminates spurious return-flow from the use of recursion. But
recursion is just one problem of many for flow analysis. For instance,
pushdown analysis still gets tripped up by the spurious cross-flow
problem; at calls to (id f) and (id g) in the previous example, it
thinks (id g) could be f or g.

Powerful techniques such as abstract garbage collection [14] were
developed to solve the cross-flow problem.² In fact, abstract garbage
collection, by itself, also delivers orders-of-magnitude improvements
to analytic speed and precision. (See Figure 1 again for a
visualization of that impact.)

It is natural to ask: can abstract garbage collection and pushdown 
analysis work together? Can their strengths be multiplied? At first
glance, the answer appears to be a disheartening No. 



1.1 The problem: The whole stack versus just the top 

Abstract garbage collection seems to require more than pushdown
analysis can decidably provide: access to the full stack. Abstract
garbage collection, as its name implies, discards unreachable values
from an abstract store during the analysis. Like concrete garbage
collection, abstract garbage collection also begins its sweep with a
root set, and like concrete garbage collection, it must traverse the
abstract stack to compute that root set. But pushdown systems are
restricted to viewing the top of the stack (or a bounded depth), a
condition violated by this traversal.

² The cross-flow problem arises because monotonicity prevents revoking a
judgment like "procedure f flows to x," or "procedure g flows to x," once
it's been made.

[Figure 1: four abstract transition graphs]
(1) without pushdown analysis or abstract GC: 653 states
(2) with pushdown only: 139 states
(3) with GC only: 105 states
(4) with pushdown analysis and abstract GC: 77 states

Figure 1. We generated an abstract transition graph for the same
program from Figure 2 four times: (1) without pushdown analysis or
abstract garbage collection; (2) with only pushdown analysis; (3) with
only abstract garbage collection; (4) with both pushdown analysis and
abstract garbage collection. With only pushdown or abstract GC, the
abstract transition graph shrinks by an order of magnitude, but in
different ways. The pushdown-only analysis is confused by variables
that are bound to several different higher-order functions, but for
short durations. The abstract-GC-only analysis is confused by
non-tail-recursive loop structure. With both techniques enabled, the
graph shrinks by nearly half yet again and fully recovers the control
structure of the original program.

(define (id x) x)

(define (f n)
  (cond [(<= n 1) 1]
        [else (* n (f (- n 1)))]))

(define (g n)
  (cond [(<= n 1) 1]
        [else (+ (* n n) (g (- n 1)))]))

(print (+ ((id f) 3) ((id g) 4)))

Figure 2. A small example to illuminate the strengths and weaknesses
of both pushdown analysis and abstract garbage collection.

Fortunately, abstract garbage collection does not need to arbitrarily 
modify the stack. In fact, it does not even need to know the order of 
the frames; it only needs the set of frames on the stack. We find a 
richer class of machine — introspective pushdown systems — which 
retains just enough restrictions to compute reachable control states, 
yet few enough to enable abstract garbage collection. 

It is therefore possible to fuse the full benefits of abstract garbage
collection with pushdown analysis. The dramatic reduction in abstract
transition graph size from the top to the bottom in Figure 1 (and
echoed by later benchmarks) conveys the impact of this fusion.

Secondary motivations There are three strong secondary motivations
for this work: (1) bringing context-sensitivity to pushdown analysis;
(2) exposing the context-freedom of the analysis; and (3) enabling
pushdown analysis without continuation-passing style.

In CFA2, monovariant (0CFA-like) context-sensitivity is etched
directly into the abstract semantics, which are, in turn, phrased in
terms of an explicit (imperative) summarization algorithm for a
partitioned continuation-passing style.

In addition, the context-freedom of the analysis is buried implicitly 
inside this algorithm. No pushdown system or context-free gram- 
mar is explicitly identified. A necessary precursor to our work was 
to make the pushdown system in CFA2 explicit. 

A third motivation was to show that a transformation to continuation- 
passing style is unnecessary for pushdown analysis. In fact, push- 
down analysis is arguably more natural over direct-style programs. 

1.2 Overview 

We first review preliminaries to set a consistent feel for terminology
and notation, particularly with respect to pushdown systems. The
derivation of the analysis begins with a concrete CESK-machine-style
semantics for A-Normal Form λ-calculus. The next step is an
infinite-state abstract interpretation, constructed by bounding the
C(ontrol), E(nvironment) and S(tore) portions of the machine while
leaving the stack, the K(ontinuation), unbounded. A simple shift in
perspective reveals that this abstract interpretation is a rooted
pushdown system.

We then introduce abstract garbage collection and quickly find that
it violates the pushdown model with its traversals of the stack. To
prove the decidability of control-state reachability, we formulate
introspective pushdown systems, and recast abstract garbage collection
within this framework. We then show that control-state reachability is
decidable for introspective pushdown systems as well.

We conclude with an implementation and empirical evaluation that 
shows strong synergies between pushdown analysis and abstract 
garbage collection, including significant reductions in the size of 
the abstract state transition graph. 

1.3 Contributions 

We make the following contributions: 

1. Our primary contribution is demonstrating the decidability of 
fusing abstract garbage collection with pushdown flow analysis 
of higher-order programs. Proof comes in the form of a fixed- 
point solution for computing the reachable control-states of an 
introspective pushdown system and an embedding of abstract 
garbage collection as an introspective pushdown system. 

2. We show that classical notions of context-sensitivity, such as
k-CFA and poly/CFA, have direct generalizations in a pushdown
setting: monovariance³ is not an essential restriction, as it is
in CFA2.

3. We make the context-free aspect of CFA2 explicit: we clearly 
define and identify the pushdown system. We do so by starting 
with a classical CESK machine and systematically abstracting 
until a pushdown system emerges. We also remove the orthogonal
frame-local-bindings aspect of CFA2, so as to focus solely on the
pushdown nature of the analysis.

4. We remove the requirement for CPS-conversion by synthesiz- 
ing the analysis directly for direct-style (in the form of A- 
normal form lambda-calculus). 

5. We empirically validate claims of improved precision on a suite 
of benchmarks. We find synergies between pushdown analysis 
and abstract garbage collection that make the whole greater
than the sum of its parts.



2. Pushdown preliminaries 

The literature contains many equivalent definitions of pushdown 
machines, so we adapt our own definitions from Sipser [20]. Readers
familiar with pushdown theory may wish to skip ahead.

2.1 Syntactic sugar 

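When a triple $(x, \ell, x')$ is an edge in a labeled graph:

\[ x \xrightarrow{\ell} x' \equiv (x, \ell, x'). \]

Similarly, when a pair $(x, x')$ is a graph edge:

\[ x \rightarrow x' \equiv (x, x'). \]

We use both string and vector notation for sequences:

\[ a_1 a_2 \ldots a_n \equiv \langle a_1, a_2, \ldots, a_n \rangle \equiv \vec{a}. \]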

2.2 Stack actions, stack change and stack manipulation

Stacks are sequences over a stack alphabet $\Gamma$. To reason about
stack manipulation concisely, we first turn stack alphabets into
"stack-action" sets; each character represents a change to the stack:
push, pop or no change.

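For each character $\gamma$ in a stack alphabet $\Gamma$, the
stack-action set $\Gamma_\pm$ contains a push character $\gamma_+$; a
pop character $\gamma_-$; and a no-stack-change indicator, $\epsilon$:

\begin{align*}
g \in \Gamma_\pm ::={} & \epsilon && \text{[stack unchanged]} \\
\mid{} & \gamma_+ \quad \text{for each } \gamma \in \Gamma && \text{[pushed } \gamma \text{]} \\
\mid{} & \gamma_- \quad \text{for each } \gamma \in \Gamma && \text{[popped } \gamma \text{]}.
\end{align*}

In this paper, the symbol $g$ represents some stack action.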

When we develop introspective pushdown systems, we are going to need
formalisms for easily manipulating stack-action strings and stacks.
Given a string of stack actions, we can compact it into a minimal
string describing net stack change. We do so through the operator
$\lfloor \cdot \rfloor : \Gamma_\pm^* \to \Gamma_\pm^*$, which cancels
out opposing adjacent push-pop stack actions:

\[ \lfloor \vec{g}\, \gamma_+ \gamma_-\, \vec{g}\,' \rfloor = \lfloor \vec{g}\, \vec{g}\,' \rfloor
   \qquad
   \lfloor \vec{g}\, \epsilon\, \vec{g}\,' \rfloor = \lfloor \vec{g}\, \vec{g}\,' \rfloor, \]

so that $\lfloor \vec{g} \rfloor = \vec{g}$, if there are no
cancellations to be made in the string $\vec{g}$.



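We can convert a net string back into a stack by stripping off the
push symbols with the stackify operator,
$\lceil \cdot \rceil : \Gamma_\pm^* \rightharpoonup \Gamma^*$:

\[ \lceil \gamma_+ \gamma'_+ \ldots \gamma^{(n)}_+ \rceil = \langle \gamma^{(n)}, \ldots, \gamma', \gamma \rangle, \]

and for convenience, $\lceil \vec{g} \rceil = \lceil \lfloor \vec{g} \rfloor \rceil$.
Notice that the stackify operator is defined only for strings whose
net stack change contains only push actions.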
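As an illustration only, the two operators can be sketched in Python. The encoding of stack actions as tagged pairs, and the names `net`, `stackify` and `EPS`, are assumptions of this sketch and do not come from the paper.

```python
# Assumed encoding: ("+", frame) is a push, ("-", frame) is a pop,
# and ("e", None) is the "stack unchanged" action.
EPS = ("e", None)

def net(actions):
    """Net stack change: drop epsilons and cancel each push that is
    immediately followed (after earlier cancellations) by its pop."""
    out = []
    for kind, frame in actions:
        if kind == "e":
            continue
        if kind == "-" and out and out[-1] == ("+", frame):
            out.pop()                    # gamma+ gamma- cancels
        else:
            out.append((kind, frame))
    return out

def stackify(actions):
    """Convert a string of stack actions into a stack, top of stack first.
    Partial: fails if the net string still contains a pop."""
    reduced = net(actions)
    if any(kind != "+" for kind, _ in reduced):
        raise ValueError("net string contains an unmatched pop")
    return [frame for _, frame in reversed(reduced)]
```

For example, under this encoding the string "push a, push b, pop b, no change, push c" has net change "push a, push c", and stackifies to the stack with c on top of a.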



2.3 Pushdown systems

A pushdown system is a triple $M = (Q, \Gamma, \delta)$ where:

1. $Q$ is a finite set of control states;

2. $\Gamma$ is a stack alphabet; and

3. $\delta \subseteq Q \times \Gamma_\pm \times Q$ is a transition relation.

The set $Q \times \Gamma^*$ is called the configuration-space of this
pushdown system. We use PDS to denote the class of all pushdown systems.

For the following definitions, let $M = (Q, \Gamma, \delta)$.

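• The labeled transition relation
$(\longmapsto_M) \subseteq (Q \times \Gamma^*) \times \Gamma_\pm \times (Q \times \Gamma^*)$
determines whether one configuration may transition to another while
performing the given stack action:

\begin{align*}
(q, \vec{\gamma}) \overset{\epsilon}{\underset{M}{\longmapsto}} (q', \vec{\gamma})
  &\text{ iff } q \xrightarrow{\epsilon} q' \in \delta && \text{[no change]} \\
(q, \gamma : \vec{\gamma}) \overset{\gamma_-}{\underset{M}{\longmapsto}} (q', \vec{\gamma})
  &\text{ iff } q \xrightarrow{\gamma_-} q' \in \delta && \text{[pop]} \\
(q, \vec{\gamma}) \overset{\gamma_+}{\underset{M}{\longmapsto}} (q', \gamma : \vec{\gamma})
  &\text{ iff } q \xrightarrow{\gamma_+} q' \in \delta && \text{[push]}.
\end{align*}

• If unlabelled, the transition relation $(\longmapsto_M)$ checks whether
any stack action can enable the transition:

\[ c \underset{M}{\longmapsto} c' \text{ iff } c \overset{g}{\underset{M}{\longmapsto}} c' \text{ for some stack action } g. \]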



³ Monovariance refers to an abstraction that groups all bindings to the same
variable together: there is one abstract variant for all bindings to each
variable.



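• For a string of stack actions $g_1 \ldots g_n$:

\[ c_0 \overset{g_1 \ldots g_n}{\underset{M}{\longmapsto}} c_n \text{ iff }
   c_0 \overset{g_1}{\underset{M}{\longmapsto}} c_1
       \overset{g_2}{\underset{M}{\longmapsto}} \cdots
       \overset{g_{n-1}}{\underset{M}{\longmapsto}} c_{n-1}
       \overset{g_n}{\underset{M}{\longmapsto}} c_n, \]

for some configurations $c_0, \ldots, c_n$.

• For the transitive closure:

\[ c \overset{*}{\underset{M}{\longmapsto}} c' \text{ iff }
   c \overset{\vec{g}}{\underset{M}{\longmapsto}} c' \text{ for some action string } \vec{g}. \]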
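The labeled transition relation on configurations can be read as a small successor function. The following Python sketch assumes that a configuration is a pair of a control state and a tuple (top of stack first), and that a transition is a triple `(source, (kind, frame), target)` with `kind` one of `"e"`, `"+"`, `"-"`; these encodings and the name `step` are illustrative assumptions.

```python
def step(delta, config):
    """Yield (action, successor) for every transition of `delta`
    that is enabled in the configuration `config`."""
    q, stack = config                    # stack: tuple, top of stack first
    for src, (kind, frame), dst in delta:
        if src != q:
            continue
        if kind == "e":                  # no change
            yield (kind, frame), (dst, stack)
        elif kind == "+":                # push
            yield (kind, frame), (dst, (frame,) + stack)
        elif kind == "-" and stack and stack[0] == frame:
            yield (kind, frame), (dst, stack[1:])   # pop the matching top
```

Note that a pop transition is enabled only when the popped character matches the top of the stack; this is the only point at which the relation inspects the stack.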



2.4 Rooted pushdown systems 

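A rooted pushdown system is a quadruple $(Q, \Gamma, \delta, q_0)$ in
which $(Q, \Gamma, \delta)$ is a pushdown system and $q_0 \in Q$ is an
initial (root) state. RPDS is the class of all rooted pushdown systems.

For a rooted pushdown system $M = (Q, \Gamma, \delta, q_0)$, we define
the reachable-from-root transition relation:

\[ c \overset{g}{\underset{M}{\longmapsto\!\!\!\to}} c' \text{ iff }
   (q_0, \langle\rangle) \overset{*}{\underset{M}{\longmapsto}} c
   \text{ and } c \overset{g}{\underset{M}{\longmapsto}} c'. \]

In other words, the root-reachable transition relation also makes
sure that the root control state can actually reach the transition.

We overload the root-reachable transition relation to operate on
control states:

\[ q \overset{g}{\underset{M}{\longmapsto\!\!\!\to}} q' \text{ iff }
   (q, \vec{\gamma}) \overset{g}{\underset{M}{\longmapsto\!\!\!\to}} (q', \vec{\gamma}')
   \text{ for some stacks } \vec{\gamma}, \vec{\gamma}'. \]

For both root-reachable relations, if we elide the stack-action label,
then, as in the un-rooted case, the transition holds if there exists
some stack action that enables the transition:

\[ q \underset{M}{\longmapsto\!\!\!\to} q' \text{ iff }
   q \overset{g}{\underset{M}{\longmapsto\!\!\!\to}} q' \text{ for some action } g. \]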

2.5 Computing reachability in pushdown systems 

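A pushdown flow analysis can be construed as computing the
root-reachable subset of control states in a rooted pushdown system,
$M = (Q, \Gamma, \delta, q_0)$:

\[ \left\{ q : (q_0, \langle\rangle) \overset{*}{\underset{M}{\longmapsto}} (q, \vec{\gamma}) \text{ for some stack } \vec{\gamma} \right\}. \]

Reps et al. and many others provide a straightforward "summarization"
algorithm to compute this set [1, 8, 17, 18]. Our preliminary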
report also offers a reachability algorithm tailored to higher-order 
programs ID. 

2.6 Nondeterministic finite automata 

In this work, we will need a finite description of all possible stacks 
at a given control state within a rooted pushdown system. We will 
exploit the fact that the set of stacks at a given control point is a 
regular language. Specifically, we will extract a nondeterministic 
finite automaton accepting that language from the structure of a 
rooted pushdown system. A nondeterministic finite automaton
(NFA) is a quintuple $M = (Q, \Sigma, \delta, q_0, F)$:

• $Q$ is a finite set of control states;

• $\Sigma$ is an input alphabet;

• $\delta \subseteq Q \times (\Sigma \cup \{\epsilon\}) \times Q$ is a transition relation;

• $q_0$ is a distinguished start state; and

• $F \subseteq Q$ is a set of accepting states.

We denote the class of all NFAs as NFA.



3. Setting: A-Normal Form λ-calculus

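Since our goal is analysis of higher-order languages, we operate on
the λ-calculus. To simplify presentation of the concrete and abstract
semantics, we choose A-Normal Form λ-calculus. (This is a strictly
cosmetic choice: all of our results can be replayed mutatis mutandis
in the standard direct-style setting as well.) ANF enforces an order
of evaluation and it requires that all arguments to a function be
atomic:

\begin{align*}
e \in \mathsf{Exp} &::= (\mathtt{let}\ ((v\ \mathit{call}))\ e) && \text{[non-tail call]} \\
  &\;\;\mid\; \mathit{call} && \text{[tail call]} \\
  &\;\;\mid\; \text{\ae} && \text{[return]} \\
f, \text{\ae} \in \mathsf{Atom} &::= v \mid \mathit{lam} && \text{[atomic expressions]} \\
\mathit{lam} \in \mathsf{Lam} &::= (\lambda\ (v)\ e) && \text{[lambda terms]} \\
\mathit{call} \in \mathsf{Call} &::= (f\ \text{\ae}) && \text{[applications]} \\
v \in \mathsf{Var} &\text{ is a set of identifiers} && \text{[variables]}.
\end{align*}

We use the CESK machine of Felleisen and Friedman [5] to specify
a small-step semantics for ANF. The CESK machine has an explicit
stack, and under a structural abstraction, the stack component of
this machine directly becomes the stack component of a pushdown
system. The set of configurations ($\mathsf{Conf}$) for this machine
has the four expected components (Figure 3).

\begin{align*}
c \in \mathsf{Conf} &= \mathsf{Exp} \times \mathsf{Env} \times \mathsf{Store} \times \mathsf{Kont} && \text{[configurations]} \\
\rho \in \mathsf{Env} &= \mathsf{Var} \rightharpoonup \mathsf{Addr} && \text{[environments]} \\
\sigma \in \mathsf{Store} &= \mathsf{Addr} \to \mathsf{Clo} && \text{[stores]} \\
\mathit{clo} \in \mathsf{Clo} &= \mathsf{Lam} \times \mathsf{Env} && \text{[closures]} \\
\kappa \in \mathsf{Kont} &= \mathsf{Frame}^* && \text{[continuations]} \\
\phi \in \mathsf{Frame} &= \mathsf{Var} \times \mathsf{Exp} \times \mathsf{Env} && \text{[stack frames]} \\
a \in \mathsf{Addr} &\text{ is an infinite set of addresses} && \text{[addresses]}.
\end{align*}

Figure 3. The concrete configuration-space.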

3.1 Semantics 

To define the semantics, we need five items:

1. $\mathcal{I} : \mathsf{Exp} \to \mathsf{Conf}$ injects an expression into a configuration:

\[ c_0 = \mathcal{I}(e) = (e, [], [], \langle\rangle). \]

2. $\mathcal{A} : \mathsf{Atom} \times \mathsf{Env} \times \mathsf{Store} \rightharpoonup \mathsf{Clo}$ evaluates atomic expressions:

\begin{align*}
\mathcal{A}(\mathit{lam}, \rho, \sigma) &= (\mathit{lam}, \rho) && \text{[closure creation]} \\
\mathcal{A}(v, \rho, \sigma) &= \sigma(\rho(v)) && \text{[variable look-up]}.
\end{align*}

3. $(\Rightarrow) \subseteq \mathsf{Conf} \times \mathsf{Conf}$ transitions between configurations.
(Defined below.)

4. $\mathcal{E} : \mathsf{Exp} \to \mathcal{P}(\mathsf{Conf})$ computes the set of reachable machine
configurations for a given program:

\[ \mathcal{E}(e) = \{ c : \mathcal{I}(e) \Rightarrow^* c \}. \]

5. $\mathit{alloc} : \mathsf{Var} \times \mathsf{Conf} \to \mathsf{Addr}$ chooses fresh store addresses
for newly bound variables. The address-allocation function is
an opaque parameter in this semantics, so that the forthcoming
abstract semantics may also parameterize allocation. This
parameterization provides the knob to tune the polyvariance
and context-sensitivity of the resulting analysis. For the sake
of defining the concrete semantics, letting addresses be natural
numbers suffices, and then the allocator can choose the lowest
unused address:

\begin{align*}
\mathsf{Addr} &= \mathbb{N} \\
\mathit{alloc}(v, (e, \rho, \sigma, \kappa)) &= 1 + \max(\mathit{dom}(\sigma)).
\end{align*}

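Transition relation To define the transition $c \Rightarrow c'$, we need
three rules. The first rule handles tail calls by evaluating the
function into a closure, evaluating the argument into a value and then
moving to the body of the closure's λ-term:

\begin{align*}
([\![(f\ \text{\ae})]\!], \rho, \sigma, \kappa) &\Rightarrow (e, \rho'', \sigma', \kappa), \text{ where} \\
([\![(\lambda\ (v)\ e)]\!], \rho') &= \mathcal{A}(f, \rho, \sigma) \\
a &= \mathit{alloc}(v, c) \\
\rho'' &= \rho'[v \mapsto a] \\
\sigma' &= \sigma[a \mapsto \mathcal{A}(\text{\ae}, \rho, \sigma)].
\end{align*}

Non-tail call pushes a frame onto the stack and evaluates the call:

\[ ([\![(\mathtt{let}\ ((v\ \mathit{call}))\ e)]\!], \rho, \sigma, \kappa) \Rightarrow (\mathit{call}, \rho, \sigma, (v, e, \rho) : \kappa). \]

Function return pops a stack frame:

\begin{align*}
(\text{\ae}, \rho, \sigma, (v, e, \rho') : \kappa) &\Rightarrow (e, \rho'', \sigma', \kappa), \text{ where} \\
a &= \mathit{alloc}(v, c) \\
\rho'' &= \rho'[v \mapsto a] \\
\sigma' &= \sigma[a \mapsto \mathcal{A}(\text{\ae}, \rho, \sigma)].
\end{align*}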
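The three rules are small enough to sketch as an interpreter. The Python below is a minimal illustration, not the authors' implementation. It assumes a tuple encoding of terms (`("lam", v, body)`, `("app", f, ae)`, `("let", v, call, body)`, variables as strings), dictionaries for environments and stores, and an allocator that treats the empty store as having maximum address 0; all names are hypothetical.

```python
def atom(ae, env, store):
    """A: evaluate an atomic expression to a closure."""
    if isinstance(ae, str):              # variable look-up
        return store[env[ae]]
    return (ae, env)                     # closure creation

def alloc(store):
    """Fresh address: 1 + max(dom(store)); 0 is assumed for an empty store."""
    return 1 + max(store, default=0)

def step(config):
    """Apply one transition rule, or return None if none applies."""
    e, env, store, kont = config
    if not isinstance(e, str) and e[0] == "let":    # non-tail call: push a frame
        _, v, call, body = e
        return (call, env, store, ((v, body, env),) + kont)
    if not isinstance(e, str) and e[0] == "app":    # tail call
        _, f, ae = e
        (_, v, body), clo_env = atom(f, env, store)
        a = alloc(store)
        return (body, {**clo_env, v: a}, {**store, a: atom(ae, env, store)}, kont)
    if kont:                                        # return: pop a frame
        (v, body, frame_env), rest = kont[0], kont[1:]
        a = alloc(store)
        return (body, {**frame_env, v: a}, {**store, a: atom(e, env, store)}, rest)
    return None                                     # atomic expression, empty stack

def run(e):
    """Inject e and iterate the transition relation to a final configuration."""
    config = (e, {}, {}, ())
    while (nxt := step(config)) is not None:
        config = nxt
    return config
```

Running the sketch on the encoding of `(let ((x ((λ (y) y) (λ (z) z)))) x)` pushes one frame, performs one tail call and one return, and halts with `x` bound to the closure over `(λ (z) z)`.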

4. Pushdown abstract interpretation 

Our first step toward a static analysis is an abstract interpretation 
into an infinite state-space. To achieve a pushdown analysis, we 
simply abstract away less than we normally would. Specifically, 
we leave the stack height unbounded. 

Figure 4 details the abstract configuration-space. To synthesize it,
we force addresses to be a finite set, but crucially, we leave the 
stack untouched. When we compact the set of addresses into a 
finite set, the machine may run out of addresses to allocate, and 
when it does, the pigeon-hole principle will force multiple closures 
to reside at the same address. As a result, we have no choice 
but to force the range of the store to become a power set in the 
abstract configuration-space. The abstract transition relation has 
components analogous to those from the concrete semantics: 

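Program injection The abstract injection function
$\hat{\mathcal{I}} : \mathsf{Exp} \to \widehat{\mathsf{Conf}}$ pairs an
expression with an empty environment, an empty store and an empty
stack to create the initial abstract configuration:

\[ \hat{c}_0 = \hat{\mathcal{I}}(e) = (e, [], [], \langle\rangle). \]

Atomic expression evaluation The abstract atomic expression
evaluator, $\hat{\mathcal{A}} : \mathsf{Atom} \times \widehat{\mathsf{Env}} \times \widehat{\mathsf{Store}} \to \mathcal{P}(\widehat{\mathsf{Clo}})$,
returns the value of an atomic expression in the context of an
environment and a store; it returns a set of abstract closures:

\begin{align*}
\hat{\mathcal{A}}(\mathit{lam}, \hat{\rho}, \hat{\sigma}) &= \{ (\mathit{lam}, \hat{\rho}) \} && \text{[closure creation]} \\
\hat{\mathcal{A}}(v, \hat{\rho}, \hat{\sigma}) &= \hat{\sigma}(\hat{\rho}(v)) && \text{[variable look-up]}.
\end{align*}

\begin{align*}
\hat{c} \in \widehat{\mathsf{Conf}} &= \mathsf{Exp} \times \widehat{\mathsf{Env}} \times \widehat{\mathsf{Store}} \times \widehat{\mathsf{Kont}} && \text{[configurations]} \\
\hat{\rho} \in \widehat{\mathsf{Env}} &= \mathsf{Var} \rightharpoonup \widehat{\mathsf{Addr}} && \text{[environments]} \\
\hat{\sigma} \in \widehat{\mathsf{Store}} &= \widehat{\mathsf{Addr}} \to \mathcal{P}(\widehat{\mathsf{Clo}}) && \text{[stores]} \\
\widehat{\mathit{clo}} \in \widehat{\mathsf{Clo}} &= \mathsf{Lam} \times \widehat{\mathsf{Env}} && \text{[closures]} \\
\hat{\kappa} \in \widehat{\mathsf{Kont}} &= \widehat{\mathsf{Frame}}^* && \text{[continuations]} \\
\hat{\phi} \in \widehat{\mathsf{Frame}} &= \mathsf{Var} \times \mathsf{Exp} \times \widehat{\mathsf{Env}} && \text{[stack frames]} \\
\hat{a} \in \widehat{\mathsf{Addr}} &\text{ is a finite set of addresses} && \text{[addresses]}.
\end{align*}

Figure 4. The abstract configuration-space.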

Reachable configurations The abstract program evaluator
$\hat{\mathcal{E}} : \mathsf{Exp} \to \mathcal{P}(\widehat{\mathsf{Conf}})$
returns all of the configurations reachable from the initial
configuration:

\[ \hat{\mathcal{E}}(e) = \left\{ \hat{c} : \hat{\mathcal{I}}(e) \leadsto^* \hat{c} \right\}. \]

Because there are an infinite number of abstract configurations, a
naive implementation of this function may not terminate.

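Transition relation The abstract transition relation
$(\leadsto) \subseteq \widehat{\mathsf{Conf}} \times \widehat{\mathsf{Conf}}$
has three rules, one of which has become non-deterministic. A tail
call may fork because there could be multiple abstract closures that
it is invoking:

\begin{align*}
([\![(f\ \text{\ae})]\!], \hat{\rho}, \hat{\sigma}, \hat{\kappa}) &\leadsto (e, \hat{\rho}'', \hat{\sigma}', \hat{\kappa}), \text{ where} \\
([\![(\lambda\ (v)\ e)]\!], \hat{\rho}') &\in \hat{\mathcal{A}}(f, \hat{\rho}, \hat{\sigma}) \\
\hat{a} &= \widehat{\mathit{alloc}}(v, \hat{c}) \\
\hat{\rho}'' &= \hat{\rho}'[v \mapsto \hat{a}] \\
\hat{\sigma}' &= \hat{\sigma} \sqcup [\hat{a} \mapsto \hat{\mathcal{A}}(\text{\ae}, \hat{\rho}, \hat{\sigma})].
\end{align*}

We define all of the partial orders shortly, but for stores:

\[ (\hat{\sigma} \sqcup \hat{\sigma}')(\hat{a}) = \hat{\sigma}(\hat{a}) \cup \hat{\sigma}'(\hat{a}). \]

A non-tail call pushes a frame onto the stack and evaluates the call:

\[ ([\![(\mathtt{let}\ ((v\ \mathit{call}))\ e)]\!], \hat{\rho}, \hat{\sigma}, \hat{\kappa}) \leadsto (\mathit{call}, \hat{\rho}, \hat{\sigma}, (v, e, \hat{\rho}) : \hat{\kappa}). \]

A function return pops a stack frame:

\begin{align*}
(\text{\ae}, \hat{\rho}, \hat{\sigma}, (v, e, \hat{\rho}') : \hat{\kappa}) &\leadsto (e, \hat{\rho}'', \hat{\sigma}', \hat{\kappa}), \text{ where} \\
\hat{a} &= \widehat{\mathit{alloc}}(v, \hat{c}) \\
\hat{\rho}'' &= \hat{\rho}'[v \mapsto \hat{a}] \\
\hat{\sigma}' &= \hat{\sigma} \sqcup [\hat{a} \mapsto \hat{\mathcal{A}}(\text{\ae}, \hat{\rho}, \hat{\sigma})].
\end{align*}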

Allocation: Polyvariance and context-sensitivity In the abstract
semantics, the abstract allocation function
$\widehat{\mathit{alloc}} : \mathsf{Var} \times \widehat{\mathsf{Conf}} \to \widehat{\mathsf{Addr}}$
determines the polyvariance of the analysis. In a control-flow
analysis, polyvariance literally refers to the number of abstract
addresses (variants) there are for each variable. An advantage of
this framework over CFA2 is that varying this abstract allocation
function instantiates pushdown versions of classical flow analyses.
All of the following allocation approaches can be used with the
abstract semantics. The abstract allocation function is a parameter
to the analysis.

Monovariance: Pushdown 0CFA Pushdown 0CFA uses variables themselves
for abstract addresses:

\begin{align*}
\widehat{\mathsf{Addr}} &= \mathsf{Var} \\
\widehat{\mathit{alloc}}(v, \hat{c}) &= v.
\end{align*}

Context-sensitive: Pushdown 1CFA Pushdown 1CFA pairs the variable
with the current expression to get an abstract address:

\begin{align*}
\widehat{\mathsf{Addr}} &= \mathsf{Var} \times \mathsf{Exp} \\
\widehat{\mathit{alloc}}(v, (e, \hat{\rho}, \hat{\sigma}, \hat{\kappa})) &= (v, e).
\end{align*}
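To make the effect of the allocation parameter concrete, the sketch below implements the two allocators defined so far together with the store join used by the abstract transition rules. It is an illustration under assumptions: configurations are 4-tuples, stores are dictionaries from addresses to frozensets, and the helper `bind` is hypothetical.

```python
def alloc_0cfa(v, config):
    """Monovariant: the variable itself is the abstract address."""
    return v

def alloc_1cfa(v, config):
    """Context-sensitive: pair the variable with the current expression."""
    e, env, store, kont = config
    return (v, e)

def bind(alloc, v, config, values, env, store):
    """Bind v to `values`, joining with whatever the address already holds."""
    a = alloc(v, config)
    new_env = {**env, v: a}
    new_store = {**store, a: store.get(a, frozenset()) | values}
    return new_env, new_store
```

Binding `x` at two different call expressions merges both values at the single address `x` under `alloc_0cfa`, while `alloc_1cfa` keeps them at two distinct addresses. This is the sense in which the allocator tunes polyvariance.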

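Polymorphic splitting: Pushdown poly/CFA Assuming we compiled the
program from a programming language with let-bound polymorphism and
marked which functions were let-bound, we can enable polymorphic
splitting:

\begin{align*}
\widehat{\mathsf{Addr}} &= \mathsf{Var} + \mathsf{Var} \times \mathsf{Exp} \\
\widehat{\mathit{alloc}}(v, ([\![(f\ \text{\ae})]\!], \hat{\rho}, \hat{\sigma}, \hat{\kappa})) &=
  \begin{cases}
    (v, [\![(f\ \text{\ae})]\!]) & f \text{ is let-bound} \\
    v & \text{otherwise.}
  \end{cases}
\end{align*}

Pushdown k-CFA For pushdown k-CFA, we need to look beyond the current
state and at the last k states. By concatenating the expressions in
the last k states together, and pairing this sequence with a variable,
we get pushdown k-CFA:

\begin{align*}
\widehat{\mathsf{Addr}} &= \mathsf{Var} \times \mathsf{Exp}^k \\
\widehat{\mathit{alloc}}(v, \langle (e_1, \hat{\rho}_1, \hat{\sigma}_1, \hat{\kappa}_1), \ldots \rangle) &= (v, \langle e_1, \ldots, e_k \rangle).
\end{align*}

4.1 Partial orders

For each set $\hat{X}$ inside the abstract configuration-space, we use
the natural partial order, $(\sqsubseteq_{\hat{X}}) \subseteq \hat{X} \times \hat{X}$.
Abstract addresses and syntactic sets have flat partial orders. For
the other sets, the partial order lifts:

• point-wise over environments:

$\hat{\rho} \sqsubseteq \hat{\rho}'$ iff $\hat{\rho}(v) = \hat{\rho}'(v)$ for all $v \in \mathit{dom}(\hat{\rho})$;

• component-wise over closures:

$(\mathit{lam}, \hat{\rho}) \sqsubseteq (\mathit{lam}, \hat{\rho}')$ iff $\hat{\rho} \sqsubseteq \hat{\rho}'$;

• point-wise over stores:

$\hat{\sigma} \sqsubseteq \hat{\sigma}'$ iff $\hat{\sigma}(\hat{a}) \sqsubseteq \hat{\sigma}'(\hat{a})$ for all $\hat{a} \in \mathit{dom}(\hat{\sigma})$;

• component-wise over frames:

$(v, e, \hat{\rho}) \sqsubseteq (v, e, \hat{\rho}')$ iff $\hat{\rho} \sqsubseteq \hat{\rho}'$;

• element-wise over continuations:

$\langle \hat{\phi}_1, \ldots, \hat{\phi}_n \rangle \sqsubseteq \langle \hat{\phi}'_1, \ldots, \hat{\phi}'_n \rangle$ iff $\hat{\phi}_i \sqsubseteq \hat{\phi}'_i$; and

• component-wise across configurations:

$(e, \hat{\rho}, \hat{\sigma}, \hat{\kappa}) \sqsubseteq (e, \hat{\rho}', \hat{\sigma}', \hat{\kappa}')$ iff $\hat{\rho} \sqsubseteq \hat{\rho}'$ and $\hat{\sigma} \sqsubseteq \hat{\sigma}'$ and $\hat{\kappa} \sqsubseteq \hat{\kappa}'$.

4.2 Soundness

To prove soundness, an abstraction map $\alpha$ connects the concrete
and abstract configuration-spaces:

\begin{align*}
\alpha(e, \rho, \sigma, \kappa) &= (e, \alpha(\rho), \alpha(\sigma), \alpha(\kappa)) \\
\alpha(\rho) &= \lambda v.\, \alpha(\rho(v)) \\
\alpha(\sigma) &= \lambda \hat{a}.\, \bigsqcup_{\alpha(a) = \hat{a}} \{ \alpha(\sigma(a)) \} \\
\alpha \langle \phi_1, \ldots, \phi_n \rangle &= \langle \alpha(\phi_1), \ldots, \alpha(\phi_n) \rangle \\
\alpha(v, e, \rho) &= (v, e, \alpha(\rho)) \\
\alpha(a) &\text{ is determined by the allocation functions.}
\end{align*}

It is then easy to prove that the abstract transition relation simulates
the concrete transition relation:

Theorem 4.1. If

\[ \alpha(c) \sqsubseteq \hat{c} \text{ and } c \Rightarrow c', \]

then there must exist $\hat{c}' \in \widehat{\mathsf{Conf}}$ such that:

\[ \alpha(c') \sqsubseteq \hat{c}' \text{ and } \hat{c} \leadsto \hat{c}'. \]

Proof. The proof follows by case-wise analysis on the type of the
expression in the configuration. It is a straightforward adaptation of
similar proofs, such as that of [11] for k-CFA. □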



5. The shift: From abstract CESK to rooted PDS 

In the previous section, we constructed an infinite-state abstract 
interpretation of the CESK machine. The infinite-state nature of the 
abstraction makes it difficult to see how to answer static analysis 
questions. Consider, for instance, a control-flow question:

At the call site $(f\ \text{\ae})$, may a closure over $\mathit{lam}$ be called?

If the abstracted CESK machine were a finite-state machine, an
algorithm could answer this question by enumerating all reachable
configurations and looking for an abstract configuration
$([\![(f\ \text{\ae})]\!], \hat{\rho}, \hat{\sigma}, \hat{\kappa})$ in which
$(\mathit{lam}, \_) \in \hat{\mathcal{A}}(f, \hat{\rho}, \hat{\sigma})$.
However, because the abstracted CESK machine may contain an infinite
number of reachable configurations, enumeration is not an option.

Fortunately, a shift in perspective reveals the abstracted CESK
machine to be a rooted pushdown system. This shift permits the
use of a control-state reachability algorithm in place of exhaustive
search of the configuration-space. In this shift, a control-state is
an expression-environment-store triple, and a stack character is a
frame. Figure 5 defines the program-to-RPDS conversion function
$\widehat{\mathcal{PDS}} : \mathsf{Exp} \to \mathrm{RPDS}$.

At this point, we can compute the root-reachable control states
using a straightforward summarization algorithm [1, 8, 17, 18]. This
is the essence of CFA2.



6. Introspection for abstract garbage collection 

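Abstract garbage collection [14] yields large improvements in
precision by using the abstract interpretation of garbage collection
to make more efficient use of the finite address space available
during analysis. Because of the way abstract garbage collection
operates, it grants exact precision to the flow analysis of variables
whose bindings die between invocations of the same abstract context.
Because pushdown analysis grants exact precision in tracking
return-flow, it is clearly advantageous to combine these techniques.
Unfortunately, as we shall demonstrate, abstract garbage collection
breaks the pushdown model by requiring full stack inspection to
discover the root set.

\begin{align*}
\widehat{\mathcal{PDS}}(e) &= (Q, \Gamma, \delta, q_0), \text{ where} \\
Q &= \mathsf{Exp} \times \widehat{\mathsf{Env}} \times \widehat{\mathsf{Store}} \\
\Gamma &= \widehat{\mathsf{Frame}} \\
(q, \epsilon, q') \in \delta &\text{ iff } (q, \hat{\kappa}) \leadsto (q', \hat{\kappa}) \text{ for all } \hat{\kappa} \\
(q, \hat{\phi}_-, q') \in \delta &\text{ iff } (q, \hat{\phi} : \hat{\kappa}) \leadsto (q', \hat{\kappa}) \text{ for all } \hat{\kappa} \\
(q, \hat{\phi}'_+, q') \in \delta &\text{ iff } (q, \hat{\kappa}) \leadsto (q', \hat{\phi}' : \hat{\kappa}) \text{ for all } \hat{\kappa} \\
(q_0, \langle\rangle) &= \hat{\mathcal{I}}(e).
\end{align*}

Figure 5. $\widehat{\mathcal{PDS}} : \mathsf{Exp} \to \mathrm{RPDS}$.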

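Abstract garbage collection modifies the transition relation to
conduct a "stop-and-copy" garbage collection before each transition.
To do this, we define a garbage collection function
$\hat{G} : \widehat{\mathsf{Conf}} \to \widehat{\mathsf{Conf}}$ on
configurations:

\[ \hat{G}(\overbrace{e, \hat{\rho}, \hat{\sigma}, \hat{\kappa}}^{\hat{c}}) = (e, \hat{\rho}, \hat{\sigma} | \mathit{Reachable}(\hat{c}), \hat{\kappa}), \]

where the pipe operation $f | S$ yields the function $f$, but with
inputs not in the set $S$ mapped to bottom (the empty set). The
reachability function
$\mathit{Reachable} : \widehat{\mathsf{Conf}} \to \mathcal{P}(\widehat{\mathsf{Addr}})$
first computes the root set, and then the transitive closure of an
address-to-address adjacency relation:

\[ \mathit{Reachable}(\overbrace{e, \hat{\rho}, \hat{\sigma}, \hat{\kappa}}^{\hat{c}}) = \left\{ \hat{a} : \hat{a}_0 \in \mathit{Root}(\hat{c}) \text{ and } \hat{a}_0 \rightarrowtail^*_{\hat{\sigma}} \hat{a} \right\}, \]

where the function
$\mathit{Root} : \widehat{\mathsf{Conf}} \to \mathcal{P}(\widehat{\mathsf{Addr}})$
finds the root addresses:

\[ \mathit{Root}(e, \hat{\rho}, \hat{\sigma}, \hat{\kappa}) = \mathit{range}(\hat{\rho}) \cup \mathit{StackRoot}(\hat{\kappa}), \]

and the $\mathit{StackRoot} : \widehat{\mathsf{Kont}} \to \mathcal{P}(\widehat{\mathsf{Addr}})$
function finds roots down the stack:

\[ \mathit{StackRoot} \langle (v_1, e_1, \hat{\rho}_1), \ldots, (v_n, e_n, \hat{\rho}_n) \rangle = \bigcup_i \mathit{range}(\hat{\rho}_i), \]

and the relation
$(\rightarrowtail) \subseteq \widehat{\mathsf{Addr}} \times \widehat{\mathsf{Store}} \times \widehat{\mathsf{Addr}}$
connects adjacent addresses:

\[ \hat{a} \rightarrowtail_{\hat{\sigma}} \hat{a}' \text{ iff there exists } (\mathit{lam}, \hat{\rho}) \in \hat{\sigma}(\hat{a}) \text{ such that } \hat{a}' \in \mathit{range}(\hat{\rho}). \]

The new abstract transition relation is thus the composition of
abstract garbage collection with the old transition relation:

\[ (\leadsto_{GC}) = (\leadsto) \circ \hat{G}. \]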
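The definitions of Root, StackRoot, Reachable and the collection function translate into a short worklist computation. The Python below is a sketch under assumptions: environments are dictionaries, the stack is a sequence of `(v, e, env)` frames, and each store entry is a list of `(lam, env)` closures (a list is used only because dictionaries are not hashable); the names `reachable` and `gc` are illustrative.

```python
def reachable(env, store, kont):
    """Addresses reachable from the root set of an abstract configuration."""
    roots = set(env.values())            # range of the current environment
    for _v, _e, frame_env in kont:       # StackRoot: every frame on the stack
        roots |= set(frame_env.values())
    seen, work = set(), list(roots)
    while work:                          # transitive closure of adjacency
        a = work.pop()
        if a in seen:
            continue
        seen.add(a)
        for _lam, clo_env in store.get(a, ()):
            work.extend(clo_env.values())
    return seen

def gc(config):
    """Restrict the store to reachable addresses; a missing key plays the
    role of bottom (the empty set)."""
    e, env, store, kont = config
    live = reachable(env, store, kont)
    return (e, env, {a: vs for a, vs in store.items() if a in live}, kont)
```

The loop over `kont` visits every frame, which is exactly the full-stack traversal that an ordinary pushdown transition is not allowed to perform. The order of frames is irrelevant to the result; only the set of frames matters.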

Problem: Stack traversal violates pushdown constraint In the
formulation of pushdown systems, the transition relation is restricted
to looking at the top frame, and even in less restricted formulations,
at most a bounded number of frames can be inspected. Thus, the
relation $(\leadsto_{GC})$ cannot be computed as a straightforward
pushdown analysis using summarization.



Solution: Introspective pushdown systems To accommodate the richer
structure of the relation $(\leadsto_{GC})$, we now define
introspective pushdown systems. Once defined, we can embed the
garbage-collecting abstract interpretation within this framework, and
then focus on developing a control-state reachability algorithm for
these systems.

An introspective pushdown system is a quadruple $M = (Q, \Gamma, \delta, q_0)$:

1. $Q$ is a finite set of control states;

2. $\Gamma$ is a stack alphabet;

3. $\delta \subseteq Q \times \Gamma^* \times \Gamma_\pm \times Q$ is a transition relation; and

4. $q_0$ is a distinguished root control state.

The second component in the transition relation is a realizable stack
at the given control-state. This realizable stack distinguishes an
introspective pushdown system from a general pushdown system.
IPDS denotes the class of all introspective pushdown systems.

Determining how (or if) a control state $q$ transitions to a control
state $q'$ requires knowing a path taken to the state $q$. Thus, we
need to define reachability inductively. When $M = (Q, \Gamma, \delta, q_0)$,
transition from the initial control state considers only empty stacks:

\[ q_0 \overset{g}{\underset{M}{\longmapsto\!\!\!\to}} q \text{ iff } (q_0, \langle\rangle, g, q) \in \delta. \]

For non-root states, the paths to that state matter, since they
determine the stacks realizable with that state:

\[ q \overset{g}{\underset{M}{\longmapsto\!\!\!\to}} q' \text{ iff there exists } \vec{g} \text{ such that }
   q_0 \overset{\vec{g}}{\underset{M}{\longmapsto\!\!\!\to}} q \text{ and } (q, \lceil \vec{g} \rceil, g, q') \in \delta, \]

\[ \text{where } q \overset{\langle g_1, \ldots, g_n \rangle}{\underset{M}{\longmapsto\!\!\!\to}} q' \text{ iff }
   q \overset{g_1}{\underset{M}{\longmapsto\!\!\!\to}} q_1 \cdots \overset{g_n}{\underset{M}{\longmapsto\!\!\!\to}} q'. \]



6.1 Garbage collection in introspective pushdown systems 

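To convert the garbage-collecting, abstracted CESK machine into
an introspective pushdown system, we use the function
$\widehat{\mathcal{IPDS}} : \mathsf{Exp} \to \mathrm{IPDS}$:

\begin{align*}
\widehat{\mathcal{IPDS}}(e) &= (Q, \Gamma, \delta, q_0) \\
Q &= \mathsf{Exp} \times \widehat{\mathsf{Env}} \times \widehat{\mathsf{Store}} \\
\Gamma &= \widehat{\mathsf{Frame}} \\
(q, \hat{\kappa}, \epsilon, q') \in \delta &\text{ iff } \hat{G}(q, \hat{\kappa}) \leadsto (q', \hat{\kappa}) \\
(q, \hat{\phi} : \hat{\kappa}, \hat{\phi}_-, q') \in \delta &\text{ iff } \hat{G}(q, \hat{\phi} : \hat{\kappa}) \leadsto (q', \hat{\kappa}) \\
(q, \hat{\kappa}, \hat{\phi}_+, q') \in \delta &\text{ iff } \hat{G}(q, \hat{\kappa}) \leadsto (q', \hat{\phi} : \hat{\kappa}) \\
(q_0, \langle\rangle) &= \hat{\mathcal{I}}(e).
\end{align*}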

7. Introspective reachability via Dyck state graphs

Having defined introspective pushdown systems and embedded our
abstract, garbage-collecting semantics within them, we are ready to
define control-state reachability for IPDSs.

We cast our reachability algorithm for introspective pushdown systems
as finding a fixed-point, in which we incrementally accrete the
reachable control states into a "Dyck state graph."

A Dyck state graph is a quadruple $G = (S, \Gamma, E, s_0)$, in which:

1. $S$ is a finite set of nodes;

2. $\Gamma$ is a set of frames;

3. $E \subseteq S \times \Gamma_\pm \times S$ is a set of stack-action edges; and

4. $s_0$ is an initial state;

such that for any node $s \in S$, it must be the case that:

\[ (s_0, \langle\rangle) \overset{*}{\underset{G}{\longmapsto}} (s, \vec{\gamma}) \text{ for some stack } \vec{\gamma}. \]

In other words, a Dyck state graph is equivalent to a rooted pushdown
system in which there is a legal path to every control state from the
initial control state.⁴ We use DSG to denote the class of Dyck state
graphs. (Clearly, DSG ⊂ RPDS.)

Our goal is to compile an implicitly-defined introspective pushdown
system into an explicitly-constructed Dyck state graph. During this
transformation, the per-state path considerations of an introspective
pushdown system are "baked into" the Dyck state graph. We can
formalize this compilation process as a map,
$\mathcal{DSG} : \mathrm{IPDS} \to \mathrm{DSG}$.

Given an introspective pushdown system $M = (Q, \Gamma, \delta, q_0)$,
its equivalent Dyck state graph is $\mathcal{DSG}(M) = (S, \Gamma, E, s_0)$,
where $s_0 = q_0$, the set $S$ contains reachable nodes:

\[ S = \left\{ q : q_0 \overset{\vec{g}}{\underset{M}{\longmapsto\!\!\!\to}} q \text{ for some stack-action sequence } \vec{g} \right\}, \]

and the set $E$ contains reachable edges:

\[ E = \left\{ q \xrightarrow{g} q' : q \overset{g}{\underset{M}{\longmapsto\!\!\!\to}} q' \right\}. \]

Our goal is to find a method for computing a Dyck state graph from
an introspective pushdown system.



7.1 Compiling to Dyck state graphs 

We now turn our attention to compiling an introspective pushdown
system (defined implicitly) into a Dyck state graph (defined
explicitly). That is, we want an implementation of the function
$\mathcal{DSG}$. To do so, we first phrase the Dyck state graph
construction as the least fixed point of a monotonic function. This
formulation provides a straightforward iterative method for computing
the function $\mathcal{DSG}$.

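The function $\mathcal{F} : \mathrm{IPDS} \to (\mathrm{DSG} \to \mathrm{DSG})$ generates the
monotonic iteration function we need:

$\mathcal{F}(M) = f$, where
$M = (Q, \Gamma, \delta, q_0)$
$f(S, \Gamma, E, s_0) = (S', \Gamma, E', s_0)$, where

$S' = S \cup \{ s' : s \in S \text{ and } s \underset{M}{\longmapsto} s' \} \cup \{ s_0 \}$

$E' = E \cup \{ s \overset{g}{\rightarrowtail} s' : s \in S \text{ and } s \overset{g}{\underset{M}{\longmapsto}} s' \}$.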

Given an introspective pushdown system $M$, each application of
the function $\mathcal{F}(M)$ accretes new edges at the frontier of the Dyck
state graph.
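
The iteration can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation described later: the names `Graph`, `iterate_once`, `dyck_state_graph` and the `step` callback are hypothetical. The callback `step(graph, s)` stands in for the transition relation of $M$ at control state `s`; it receives the graph built so far because an introspective system may need to consult the stacks that graph implies. Termination assumes finitely many control states and frames.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, NamedTuple, Tuple

State = Hashable
Action = Tuple[str, object]   # ("push", frame), ("pop", frame) or ("eps", None)
Edge = Tuple[State, Action, State]


class Graph(NamedTuple):
    nodes: FrozenSet[State]
    edges: FrozenSet[Edge]
    root: State


# Hypothetical stand-in for the transition relation of M: given the graph
# built so far and a control state, yield (action, successor) pairs.
Step = Callable[[Graph, State], Iterable[Tuple[Action, State]]]


def iterate_once(step: Step, g: Graph) -> Graph:
    """One application of f: add the root and every one-step successor."""
    nodes = set(g.nodes) | {g.root}
    edges = set(g.edges)
    for s in g.nodes:
        for action, successor in step(g, s):
            nodes.add(successor)
            edges.add((s, action, successor))
    return Graph(frozenset(nodes), frozenset(edges), g.root)


def dyck_state_graph(step: Step, root: State) -> Graph:
    """Kleene iteration from the empty graph until nothing changes."""
    g = Graph(frozenset(), frozenset(), root)
    while True:
        nxt = iterate_once(step, g)
        if nxt == g:
            return g
        g = nxt
```

Each round only looks one transition past the nodes already present, which is why the number of rounds is bounded by the size of the final graph.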



* We chose the term Dyck state graph because the sequences of stack
actions along valid paths through the graph correspond to substrings in
Dyck languages. A Dyck language is a language of balanced, "colored"
parentheses. In this case, each character in the stack alphabet is a color.



7.2 Computing a round of F 

The formalism obscures an important detail in the computation
of an iteration: the transition relation $(\longmapsto_M)$ for the
introspective pushdown system must compute all possible stacks in
determining whether or not there exists a transition. Fortunately,
this is not as onerous as it seems: the set of all possible stacks for
any given control point is a regular language, and the finite
automaton that encodes this language can be lifted from (or read off)
the structure of the Dyck state graph. The function
$\mathit{Stacks} : \mathrm{DSG} \to S \to \mathrm{NFA}$ performs exactly this extraction:

$\mathit{Stacks}(S, \Gamma, E, s_0)(s) = (S, \Gamma, \delta, s_0, \{s\})$, where
$(s', \gamma, s'') \in \delta$ if $(s', \gamma_+, s'') \in E$
$(s', \epsilon, s'') \in \delta$ if $s' \overset{\vec{g}}{\underset{M}{\longmapsto}} s''$ and $[\vec{g}] = \epsilon$.
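
To make the extraction concrete, here is a small Python sketch. It is illustrative only: `stacks_nfa` and `accepts` are hypothetical names, and the set of net-empty pairs (`eps_pairs`) is assumed to have been computed already. The automaton reads a stack from bottom to top: each push edge becomes a transition labeled with its frame, each net-empty path becomes an ε-transition, and the only accepting state is the control state of interest.

```python
def stacks_nfa(push_edges, eps_pairs, root, target):
    """Build (delta, eps, start, finals) describing the stacks possible at `target`.

    push_edges: iterable of (source, frame, destination) for push edges.
    eps_pairs:  iterable of (source, destination) joined by a net-empty path
                (assumed to be precomputed).
    """
    delta = {}
    for src, frame, dst in push_edges:
        delta.setdefault((src, frame), set()).add(dst)
    eps = {}
    for src, dst in eps_pairs:
        eps.setdefault(src, set()).add(dst)
    return delta, eps, root, {target}


def _eps_closure(eps, states):
    """All states reachable from `states` using only epsilon transitions."""
    seen, todo = set(states), list(states)
    while todo:
        s = todo.pop()
        for t in eps.get(s, ()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


def accepts(nfa, stack):
    """Return True if `stack` (listed bottom frame first) is possible."""
    delta, eps, start, finals = nfa
    current = _eps_closure(eps, {start})
    for frame in stack:
        moved = set()
        for s in current:
            moved |= delta.get((s, frame), set())
        current = _eps_closure(eps, moved)
    return bool(current & finals)
```

For a graph that pushes `a`, pushes `b`, pops `b` and then takes an ε-step, the automaton for the final state accepts the stack `["a"]` and rejects `["a", "b"]`.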

7.3 Correctness 

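Once the algorithm reaches a fixed point, the Dyck state graph is
complete:

Theorem 7.1. $\mathcal{DSG}(M) = \mathrm{lfp}(\mathcal{F}(M))$.

Proof. Let $M = (Q, \Gamma, \delta, q_0)$. Let $f = \mathcal{F}(M)$. Observe that
$\mathrm{lfp}(f) = f^n(\emptyset, \Gamma, \emptyset, q_0)$ for some $n$. When
$N \subseteq \mathcal{DSG}(M)$, it is easy to show that
$f(N) \subseteq \mathcal{DSG}(M)$. Hence,
$\mathcal{DSG}(M) \supseteq \mathrm{lfp}(\mathcal{F}(M))$.

To show $\mathcal{DSG}(M) \subseteq \mathrm{lfp}(\mathcal{F}(M))$, suppose this is not the
case. Then, there must be at least one edge in $\mathcal{DSG}(M)$ that is
not in $\mathrm{lfp}(\mathcal{F}(M))$. By the definition of $\mathcal{DSG}(M)$, each edge
must be part of a sequence of edges from the initial state. Let
$(s, g, s')$ be the first edge in its sequence from the initial state
that is not in $\mathrm{lfp}(\mathcal{F}(M))$. Because the preceding edge is in
$\mathrm{lfp}(\mathcal{F}(M))$ (or $s$ is the initial state itself), the state $s$
is in $\mathrm{lfp}(\mathcal{F}(M))$. Let $m$ be the lowest natural number such
that $s$ appears in $f^m(\emptyset, \Gamma, \emptyset, q_0)$. By the definition of
$f$, this edge must appear in $f^{m+1}(\emptyset, \Gamma, \emptyset, q_0)$, which
means it must also appear in $\mathrm{lfp}(\mathcal{F}(M))$, which is a
contradiction. Hence, $\mathcal{DSG}(M) \subseteq \mathrm{lfp}(\mathcal{F}(M))$. □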

7.4 Complexity 

While decidability is the goal, it is straightforward to determine
the complexity of this naive fixed-point method. We ask two
questions: how many times would the algorithm invoke the iteration
function in the worst case, and how much does each invocation cost in
the worst case? The size of the final Dyck state graph bounds the
run-time of the algorithm. Suppose the final Dyck state graph has $m$
states. In the worst case, the iteration function adds only a single
edge each time. Between any two states, there is one ε-edge, one push
edge, or some number of pop edges (at most $|\Gamma|$). Since there are at
most $|\Gamma| m^2$ edges in the final graph, the maximum number of
iterations is $|\Gamma| m^2$.

The cost of computing each iteration is harder to bound. The cost
of determining whether to add a push edge is constant, as is the cost
of adding an ε-edge. So the cost of determining all new push edges
and new ε-edges to add is constant. Determining whether or not to
add a pop edge is expensive. To add the pop edge
$s \overset{\gamma_-}{\rightarrowtail} s'$, we must prove that there exists a
configuration-path to the control state $s$, in which the character
$\gamma$ is on the top of the stack. This reduces to a CFL-reachability
query [9] at each node, the cost of which is $O(|\Gamma_\pm|^3 m^3)$ [8].



To summarize, in terms of the number of reachable control states,
the complexity of this naive algorithm is:

$O((|\Gamma| m^2) \times (|\Gamma_\pm|^3 m^3)) = O(|\Gamma|^4 m^5)$.

(As with summarization, it is possible to maintain a work-list and
introduce an ε-closure graph to avoid spurious recomputation. This
ultimately reduces complexity to $O(|\Gamma|^2 m^4)$.)
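
The arithmetic behind the naive bound can be spelled out. The derivation below assumes that the stack-action alphabet $\Gamma_\pm$ consists of ε plus one push and one pop action per frame, so that $|\Gamma_\pm| = 2|\Gamma| + 1 \le 3|\Gamma|$ whenever $\Gamma$ is non-empty; that reading of $\Gamma_\pm$ is an assumption of this sketch.

```latex
\begin{align*}
\underbrace{|\Gamma|\, m^2}_{\text{iterations}}
  \times
\underbrace{|\Gamma_\pm|^3 m^3}_{\text{cost per iteration}}
  &\le |\Gamma|\, m^2 \times (3|\Gamma|)^3 m^3 \\
  &= 27\, |\Gamma|^4 m^5 \\
  &\in O(|\Gamma|^4 m^5).
\end{align*}
```

The constant factor disappears in the asymptotic bound, which is why the summary states the result in terms of $|\Gamma|$ alone.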



8. Implementation and evaluation 

We have developed an implementation to produce the Dyck state
graph of an introspective pushdown system. While the fixed-point
computation of Section 7.2 could be rendered directly as functional
code, extending the classical summarization-based algorithm for
pushdown reachability to introspective pushdown systems yields better
performance. In this section we present a variant of such an
algorithm and discuss results from an implementation that can analyze
a large subset of the Scheme programming language.



8.1 Iterating over a DSG: An implementor's view 

To synthesize a Dyck state graph from an introspective pushdown
system, we build it incrementally: node by node, edge by edge. The
naive fixed-point algorithm presented earlier, if implemented
literally, would (in the worst case) have to re-examine the entire DSG
to add each edge. To avoid such re-examination, our implementation
adds ε-summary edges to the DSG.

In short, an ε-summary edge connects two control states if there
exists a path between them with no net stack change, that is, a path
on which all pushes are cancelled by corresponding pops. With
ε-summary edges available, any change to the graph can be propagated
directly to where it has an effect, and then any new ε-summary edges
that propagation implies are added.

Whereas the correspondence between CESK and an IPDS is rela- 
tively straightforward, the relationship between a DSG and its orig- 
inal IPDS is complicated by the fact that the IPDS keeps track of 
the whole stack, whereas the DSG distributes (the same) stack in- 
formation throughout its internal structure. 

A classic reachability-based analysis for a pushdown system
requires two mutually-dependent pieces of information in order to
add another edge:

1. The topmost frame on a stack for a given control state $q$. This is
essential for return transitions, as this frame should be popped
from the stack and the store and the environment of the caller
should be updated accordingly.

2. Whether a given control state $q$ is reachable or not from the
initial state $q_0$ along realizable sequences of stack actions. For
example, a path from $q_0$ to $q$ along edges labeled "push, pop, pop,
push" is not realizable: the stack is empty after the first pop, so
the second pop cannot happen, let alone the subsequent push.
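
The realizability condition in item 2 can be checked mechanically for a single sequence of stack actions. The following Python sketch (the function name `is_realizable` and the tuple encoding of actions are illustrative assumptions) simulates the stack starting from empty and rejects any pop that finds the stack empty or finds a different frame on top.

```python
def is_realizable(actions):
    """Check a sequence of stack actions against an initially empty stack.

    actions: iterable of ("push", frame), ("pop", frame) or ("eps", None).
    """
    stack = []
    for kind, frame in actions:
        if kind == "push":
            stack.append(frame)
        elif kind == "pop":
            # A pop is legal only if the matching frame is on top.
            if not stack or stack[-1] != frame:
                return False
            stack.pop()
        elif kind != "eps":
            raise ValueError(f"unknown stack action: {kind!r}")
    return True
```

The "push, pop, pop, push" path from the text is rejected at its second pop, whereas a path such as "push, push, pop" is accepted and leaves one frame on the stack.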

These two data are enough for a classic pushdown reachability 
summarization to proceed one step further. However, the presence 
of an abstract garbage collector, and the graduation to an introspec- 
tive pushdown system, imposes the requirement for a third item of 
data: 



3. For a given control state $q$, what are all possible frames that
could happen to be on the stack at the moment the IPDS is in
the state $q$?

It is possible to recompute these frames from scratch in each
iteration using the NFA-extraction technique we described. But it
is easier to maintain per-node summaries, in the same spirit as
ε-summary edges.

A version of the classic pushdown summarization algorithm that
maintains the first two items is presented in [4], so we will just
outline the key differences here.

The crux of the algorithm is to maintain, for each node $q'$ in the
DSG, a set of ε-predecessors, i.e., nodes $q$ such that
$q \overset{\vec{g}}{\underset{M}{\longmapsto}} q'$ and $[\vec{g}] = \epsilon$. In fact, only
two out of three kinds of transitions can cause a change to the set of
ε-predecessors for a particular node $q$: an addition of an ε-edge or
a pop edge to the DSG.

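It is easy to see why the second action might introduce new ε-paths
and, therefore, new ε-predecessors. Consider, for example, adding
the $\gamma_-$-edge $q \overset{\gamma_-}{\rightarrowtail} q'$ into the following graph:

$q_0 \overset{\gamma_+}{\rightarrowtail} q \qquad q' \overset{\epsilon}{\rightarrowtail} q_1$

As soon as this edge drops in, there is an "implicit" ε-edge
between $q_0$ and $q_1$ because the net stack change between them is
empty; the resulting graph looks like:

$q_0 \overset{\gamma_+}{\rightarrowtail} q \overset{\gamma_-}{\rightarrowtail} q' \overset{\epsilon}{\rightarrowtail} q_1$, together with $q_0 \overset{\epsilon}{\dashrightarrow} q_1$

where we have illustrated the implicit ε-edge as a dashed line.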
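
The example can be reproduced with a naive saturation procedure. The Python sketch below is not the work-list algorithm used in the implementation; it is a deliberately simple fixed-point loop, and the name `eps_summaries` and the edge encoding are assumptions made for illustration. It derives a pair `(p, q)` whenever there is a non-empty path from `p` to `q` with no net stack change, using two rules: summaries compose transitively, and a push followed by an optional summary followed by a matching pop yields a summary.

```python
def eps_summaries(edges):
    """Return all (p, q) joined by a non-empty path with no net stack change.

    edges: iterable of (source, (kind, frame), destination), where kind is
    "push", "pop" or "eps".
    """
    edges = list(edges)
    pushes = [(s, f, t) for s, (k, f), t in edges if k == "push"]
    pops = [(s, f, t) for s, (k, f), t in edges if k == "pop"]
    summary = {(s, t) for s, (k, _f), t in edges if k == "eps"}
    while True:
        new = set()
        # Rule 1: summaries are transitive.
        for a, b in summary:
            for c, d in summary:
                if b == c:
                    new.add((a, d))
        # Rule 2: push f, then nothing or a summary, then pop f.
        for a, f, b in pushes:
            for c, g, d in pops:
                if f == g and (b == c or (b, c) in summary):
                    new.add((a, d))
        if new <= summary:
            return summary
        summary |= new
```

Running it on the graph before the pop edge is added yields only the explicit ε-edge; after the pop edge is added, the pair for the initial and final states of the example appears as well.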

A little reflection on ε-predecessors and top frames reveals a mutual
dependency between these items during the construction of a DSG.
Informally:

• A top frame for a state $q$ can be pushed by a direct predecessor,
or by a direct predecessor of an ε-predecessor.

• When a new ε-edge $q \overset{\epsilon}{\rightarrowtail} q'$ is added, all
ε-predecessors of $q$ also become ε-predecessors of $q'$. That is,
ε-summary edges are transitive.

• When a $\gamma_-$-pop-edge $q \overset{\gamma_-}{\rightarrowtail} q'$ is added, new
ε-predecessors of a state $q_1$ can be obtained by checking if $q'$ is
an ε-predecessor of $q_1$ and examining all existing ε-predecessors
of $q$ such that $\gamma_+$ is their possible top frame: this situation is
similar to the one depicted in the example above.

The third component (all possible frames on the stack for a state
$q$) is straightforward to compute with ε-predecessors: starting
from $q$, trace out only the edges which are labeled ε (summary
or otherwise) or $\gamma_+$. The frame for any action $\gamma_+$ in this trace
is a possible stack frame. Since these sets grow monotonically, it is
easy to cache the results of the trace and, in fact, to propagate
incremental changes to these caches when new ε-summary or $\gamma_+$
edges are introduced. Our implementation directly reflects the
optimizations discussed above.
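
A minimal Python sketch of this trace follows. It reads "trace out" as a backward traversal from the state of interest, which is an interpretation rather than something the text states outright, and it recomputes the answer from scratch instead of maintaining the incremental caches just described. The function name `possible_frames` and the edge encoding are hypothetical.

```python
def possible_frames(edges, summaries, state):
    """Collect frames that may be on the stack when control is at `state`.

    edges:     iterable of (source, (kind, frame), destination).
    summaries: iterable of (source, destination) epsilon-summary pairs.
    Pop edges are ignored; their effect is captured by the summaries.
    """
    eps_preds = {}    # destination -> sources via eps or summary edges
    push_preds = {}   # destination -> (source, frame) via push edges
    for src, (kind, frame), dst in edges:
        if kind == "eps":
            eps_preds.setdefault(dst, set()).add(src)
        elif kind == "push":
            push_preds.setdefault(dst, set()).add((src, frame))
    for src, dst in summaries:
        eps_preds.setdefault(dst, set()).add(src)

    frames, seen, todo = set(), {state}, [state]
    while todo:
        node = todo.pop()
        preds = set(eps_preds.get(node, ()))
        for src, frame in push_preds.get(node, ()):
            frames.add(frame)      # this frame sits somewhere below `state`
            preds.add(src)
        for src in preds - seen:
            seen.add(src)
            todo.append(src)
    return frames
```

Because every node of a Dyck state graph is reachable from the initial state, each push edge found by the backward walk corresponds to a frame that some realizable path leaves on the stack.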

8.2 Experimental results 

A fair comparison between different families of analyses should
compare both precision and speed. We have extended an existing
implementation of k-CFA to optionally enable pushdown analysis,
abstract garbage collection or both. Our implementation source and
benchmarks are available:

http://github.com/ilyasergey/reachability

Figure 6. Benchmark results. The first three columns provide the name of a benchmark and the number of expressions and variables in the
program in ANF, respectively. For each of eight combinations of pushdown analysis on or off, $k \in \{0, 1\}$, and garbage collection on or off, the first
two columns in a group show the number of control states and transitions/DSG edges computed during the analysis (for both, less is better).
The third column presents the number of singleton variables, i.e., how many variables have a single lambda flow to them (more is better).
Inequalities for some results denote the case when the analysis did not finish within 30 minutes. For such cases we can only report an upper
bound on singleton variables, as this number can only decrease.

As expected, the fused analysis does at least as well as the best
of either analysis alone in terms of singleton flow sets (a good
metric for program optimizability) and better than both in some
cases. Also worthy of note is the dramatic reduction in the size
of the abstract transition graph for the fused analysis, even on
top of the already large reductions achieved by abstract garbage
collection and pushdown flow analysis individually. The size of the
abstract transition graph is a good heuristic measure of the temporal
reasoning ability of the analysis, e.g., its ability to support
model-checking of safety and liveness properties [12].

In order to exercise both well-known and newly-presented instances
of CESK-based CFAs, we took a series of small benchmarks exhibiting
archetypal control-flow patterns (see Figure 6). Most benchmarks are
taken from the CFA literature: mj09 is a running example from the
work of Midtgaard and Jensen designed to exhibit a non-trivial
return-flow behavior; eta and blur test common functional idioms,
mixing closures and eta-expansion; kcfa2 and kcfa3 are two worst-case
examples extracted from Van Horn and Mairson's proof of k-CFA
complexity [21]; loop2 is an example from Might's dissertation that
was used to demonstrate the impact of abstract GC [11, Section 13.3];
sat is a brute-force SAT-solver with backtracking.

8.2.1 Comparing precision 

In terms of precision, the fusion of pushdown analysis and abstract 
garbage collection substantially cuts abstract transition graph sizes 
over one technique alone. 

We also measure singleton flow sets as a heuristic metric for
precision. Singleton flow sets are a necessary precursor to
optimizations such as flow-driven inlining, type-check elimination
and constant propagation. Here again, the fused analysis prevails as
the best-of- or better-than-both-worlds.
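
As an illustration of the metric, the following Python sketch counts singleton flow sets given the result of a flow analysis. The representation (a dictionary from variable names to the set of lambda terms that may flow to each variable) and the names `singleton_variables` and `count_singletons` are assumptions made for this example; they are not taken from the implementation.

```python
def singleton_variables(flows):
    """Return the variables whose flow set contains exactly one lambda.

    flows: dict mapping a variable name to the set of lambda labels that
    the analysis says may flow to it (hypothetical representation).
    """
    return sorted(var for var, lambdas in flows.items() if len(lambdas) == 1)


def count_singletons(flows):
    """Number of singleton flow sets; higher means more precise results."""
    return len(singleton_variables(flows))
```

A variable with an empty flow set is not counted, since only a flow set with exactly one member licenses optimizations such as inlining the unique callee.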
Figure 7. We ran our benchmark suite on a 2-core 2.66 GHz OS X
machine with 4 GB RAM. For each of the four analyses the left
column denotes the values obtained with no abstract collection,
and the right one the values obtained with GC on. The results of the
analyses are presented in minutes (′) or seconds (″), where ε means a
value less than 1 second and ∞ stands for an analysis that was
interrupted because its execution time exceeded 30 minutes.

Running on the benchmarks, we have revalidated hypotheses about
the improvements to precision granted by both pushdown analysis
[22] and abstract garbage collection [11]. The table in Figure 6
contains our detailed results on the precision of the analysis.

8.2.2 Comparing speed 

In the original work on CFA2, Vardoulakis and Shivers present
experimental results with a remark that the running time of the
analysis is proportional to the size of the reachable states
[22, Section 6]. There is a similar correlation in the fused analysis,
but it is not as strong or as absolute. From examination of the
results, this appears to be because small graphs can have large stores
inside each state, which increases the cost of garbage collection (and
thus transition) on a per-state basis, and there is some additional
per-transition overhead involved in maintaining the caches inside the
Dyck state graph. The table in Figure 7 collects absolute execution
times for comparison.

It follows from the results that pure machine-style k-CFA is always
significantly worse in terms of execution time than the same analysis
with either GC or a pushdown system. The histogram in Figure 8
presents normalized relative times of the analyses' executions. About
half the time, the fused analysis is faster than one of pushdown
analysis or abstract garbage collection. And about a tenth of the
time, it is faster than both.† When the fused analysis is slower than
both, it is generally not much worse than twice as slow as the next
slowest analysis.

Figure 8. Analysis times relative to worst (= 1) in class; smaller is better. On the left is the monovariant 0CFA class of analyses, on the right
is the polyvariant 1CFA class of analyses. (Non-GC k-CFA omitted.)

Given the already substantial reductions in analysis times provided
by collection and pushdown analysis, the amortized penalty is a
small and acceptable price to pay for improvements to precision.



9. Related work 

Garbage-collecting pushdown control-flow analysis draws on work
in higher-order control-flow analysis [19], abstract machines [5]
and abstract interpretation [3].



Context-free analysis of higher-order programs. The motivating
work for our own is Vardoulakis and Shivers's very recent discovery
of CFA2 [22]. CFA2 is a table-driven summarization algorithm that
exploits the balanced nature of calls and returns to improve
return-flow precision in a control-flow analysis. Though CFA2 exploits
context-free languages, context-free languages are not explicit in its
formulation in the same way that pushdown systems are explicit in
our presentation of pushdown flow analysis. With respect to CFA2,
our pushdown flow analysis is also polyvariant/context-sensitive
(whereas CFA2 is monovariant/context-insensitive), and it covers
direct-style.

On the other hand, CFA2 distinguishes stack-allocated and store- 
allocated variable bindings, whereas our formulation of pushdown 
control-flow analysis does not: it allocates all bindings in the store. 
If CFA2 determines a binding can be allocated on the stack, that 
binding will enjoy added precision during the analysis and is not 
subject to merging like store-allocated bindings. While we could 
incorporate such a feature in our formulation, it is not necessary 
for achieving "pushdownness," and in fact, it could be added to 
classical finite-state CFAs as well. 



† The SAT-solving benchmark showed a dramatic improvement with the
addition of context-sensitivity. Evaluation of the results showed that
context-sensitivity provided enough fuel to eliminate most of the
non-determinism from the analysis.



Calculational approach to abstract interpretation. Midtgaard and
Jensen [10] systematically calculate 0CFA using the
Cousot-Cousot-style calculational approach to abstract interpretation
[2] applied to an ANF λ-calculus. Like the present work, Midtgaard
and Jensen start with the CESK machine of Flanagan et al. [6] and
employ a reachable-states model.

The analysis is then constructed by composing well-known Galois
connections to reveal a 0CFA incorporating reachability. The
abstract semantics approximate the control stack component of the
machine by its top element. The authors remark that monomorphism
materializes in two mappings: "one mapping all bindings to the
same variable," the other "merging all calling contexts of the same
function." Essentially, the pushdown 0CFA of Section 4 corresponds
to Midtgaard and Jensen's analysis when the latter mapping is omitted
and the stack component of the machine is not abstracted.



CFL- and pushdown-reachability techniques. This work also
draws on CFL- and pushdown-reachability analysis [1, 8, 17, 18].
For instance, ε-closure graphs, or equivalent variants thereof,
appear in many context-free-language and pushdown reachability
algorithms. For our analysis, we implicitly invoked these methods
as subroutines. When we found these algorithms lacking (as with
their enumeration of control states), we developed Dyck state graph
construction.

CFL-reachability techniques have also been used to compute
classical finite-state abstraction CFAs [9] and type-based polymorphic
control-flow analysis [16]. These analyses should not be confused
with pushdown control-flow analysis, which is computing a
fundamentally more precise kind of CFA. Moreover, Rehof and
Fähndrich's method is cubic in the size of the typed program, but the
types may be exponential in the size of the program. Finally, our
technique is not restricted to typed programs.



Model-checking higher-order recursion schemes. There is
terminology overlap with work by Kobayashi [7] on model-checking
higher-order programs with higher-order recursion schemes, which
are a generalization of context-free grammars in which productions
can take higher-order arguments, so that an order-0 scheme
is a context-free grammar. Kobayashi exploits a result by Ong [15]
which shows that model-checking these recursion schemes is
decidable (but ELEMENTARY-complete) by transforming higher-order
programs into higher-order recursion schemes.



Given the generality of model-checking, Kobayashi's technique
may be considered an alternate paradigm for the analysis of
higher-order programs. For the case of order-0, both Kobayashi's
technique and our own involve context-free languages, though ours is
for control-flow analysis and his is for model-checking with respect
to a temporal logic. After these surface similarities, the techniques
diverge. In particular, higher-order recursion schemes are limited
to model-checking programs in the simply-typed lambda-calculus
with recursion.



10. Conclusion 

Our motivation was to further probe the limits of decidability for 
pushdown flow analysis of higher-order programs by enriching it 
with abstract garbage collection. We found that abstract garbage 
collection broke the pushdown model, but not irreparably so. By 
casting abstract garbage collection in terms of an introspective 
pushdown system and synthesizing a new control-state reachability 
algorithm, we have demonstrated the decidability of fusing two 
powerful analytic techniques. 

As a byproduct of our formulation, it was also easy to demon- 
strate how polyvariant/context-sensitive flow analyses generalize 
to a pushdown formulation, and we lifted the need to transform to 
continuation-passing style in order to perform pushdown analysis. 

Our empirical evaluation is highly encouraging: it shows that the 
fused analysis provides further large reductions in the size of the 
abstract transition graph — a key metric for interprocedural control- 
flow precision. And, in terms of singleton flow sets — a heuristic 
metric for optimizability — the fused analysis proves to be a "better- 
than-both-worlds" combination. 

Thus, we provide a sound, precise and polyvariant introspective 
pushdown analysis for higher-order programs. 